ADAPTIVE DIFFERENTIAL PULSE CODE MODULATION SYSTEM AND 
METHOD UTILIZING WHITENING FILTER 
FOR UPDATING OF PREDICTOR COEFFICIENTS 

CROSS REFERENCE TO RELATED APPLICATIONS 

[001] The present application claims the benefit of priority from U.S. Provisional 
Patent Application No. 60/183,280, entitled "Adaptive Differential Pulse Code 
Modulation System and Method Utilizing Whitening Filter For Updating Of 
Predictor Coefficients," filed on February 17, 2000, which is incorporated by 
reference herein. 


BACKGROUND 


Field of Invention 


[002] The present invention relates generally to encoding and decoding of 
digital audio signals, and more particularly to predictor adaptation in adaptive 
differential pulse code modulation (ADPCM) systems. 

Description of the Prior Art 

[003] FIG. 1 may be referenced in conjunction with the following discussion. 
ADPCM is a well-known technique for encoding speech and other audio signals 
for subsequent transmission over a network. A standard implementation of such 
a system is described in the International Telecommunication Union (ITU-T) 
Recommendation G.722, 7 kHz Audio-Coding Within 64 kbit/s, which is 
incorporated by reference herein. 

[004] As described in U.S. Pat. No. 4,317,208, issued February 23, 1982 to 
Araseki et al. and incorporated by reference herein, a differential pulse code 
modulation system is a band compression system in which a prediction of each 
signal sample at a present time period is based on signal samples at past time 
periods. Such a process is particularly effective with voice and similar band 
signals due to their high degree of correlation between successive signal samples. 
A predicted signal Sj at a time j is typically derived at a transmitter 102 by the 
general equation: 

$$S_j = A_1 S_{j-1} + A_2 S_{j-2} + \cdots + A_n S_{j-n}$$

[005] where A1, A2, ..., An are termed the prediction coefficients. The prediction 
coefficients are selected to minimize the difference between an input signal Yj 
and the predicted signal Sj, thus minimizing a prediction error Ej, which is in turn 
quantized and transmitted to a receiver 104, thereby requiring significantly less 
transmission bandwidth than would the input signal. The receiver 104 works in 
a manner generally the reverse of the transmitter 102, thereby reconstructing the 
input signal. 
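The prediction step above can be illustrated with a minimal sketch. The function names, the two-tap coefficient values, and the sample values are hypothetical and chosen only so the arithmetic is easy to follow; they are not taken from the Araseki reference.

```python
def predict(history, coeffs):
    """Return A_1*S_{j-1} + ... + A_n*S_{j-n}; history[0] holds S_{j-1}."""
    return sum(a * s for a, s in zip(coeffs, history))


def prediction_error(y, history, coeffs):
    """Return E_j = Y_j - S_j, the quantity that is quantized and sent."""
    return y - predict(history, coeffs)


# Hypothetical values: a steadily rising signal and coefficients that
# extrapolate linearly from the two previous samples.
past = [102.0, 100.0]        # S_{j-1}, S_{j-2}
A = [2.0, -1.0]              # assumed coefficients, not from the source
error = prediction_error(104.0, past, A)
```

When successive samples are strongly correlated, as here, the error is much smaller than the sample itself, which is why transmitting the error needs fewer bits.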

[006] The characteristics of a voice or related audio signal vary with time; 
consequently, the optimum coefficient values also vary. One method of 
attempting to efficiently derive prediction coefficients is to adapt them with the 
goal of minimizing the prediction error Ej while such error is being observed, 
an approach that generally describes an ADPCM system. A common type of predictor 
employed in these systems is a pole-based predictor, such as predictors 110 and 
126, which utilizes a feedback loop to minimize the energy in the prediction error 
signal Ej, which is sometimes referred to as the difference or residual signal. 

[007] Due to the reality of frequent transmission errors between the transmitter 
102 and the receiver 104, the prediction errors Êj (which have been inverse 
quantized) produced at the receiver 104, and thus the reconstructed input signal 
Sj depending thereon, have a tendency to diverge from the real input signal Yj 
received at the transmitter 102. To gradually eliminate the adverse effect of the 
transmission errors, the prediction coefficients are typically derived by the 
general equation: 

$$A_i^{j+1} = A_i^{j}(1-\delta) + g \cdot F_1(S_{j-i}) \cdot F_2(\hat{E}_j)$$

[008] where i = 1 to n, δ is a positive value much smaller than 1, g is a proper 
positive constant, Sj-i is a reconstructed input signal delayed i samples, and F1 
and F2 are non-decreasing functions. The receiver 104 prediction coefficient 
values are tracked, or gradually caused to converge to those of the transmitter 
102, by operation of the term (1-δ). The detrimental effect of transmission errors 
is thus partially overcome. 
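The tracking effect of the leakage term can be sketched numerically. In the sketch below, F1 and F2 are both taken to be sign functions, and the values of the leak, the gain g, the starting coefficients, and the driving signal are all assumptions chosen for illustration. It further assumes that, after an initial disturbance, both ends see identical signal terms.

```python
def sgn(v):
    """Sign function: 1 for a nonnegative argument, -1 otherwise."""
    return 1 if v >= 0 else -1


def leaky_update(a, s_delayed, e_hat, delta=1 / 256, g=0.01):
    """One coefficient update: a*(1 - delta) + g*F1(s)*F2(e).

    F1 and F2 are assumed to be sign functions in this sketch.
    """
    return a * (1 - delta) + g * sgn(s_delayed) * sgn(e_hat)


def mismatch_after(steps, a_tx=0.5, a_rx=0.9, delta=1 / 256):
    """Coefficient gap between transmitter and receiver after `steps` updates."""
    signal = [(-1) ** k * (k % 5 - 2) for k in range(steps)]  # arbitrary test data
    for k in range(steps):
        s, e = signal[k], signal[k - 1]
        a_tx = leaky_update(a_tx, s, e, delta)
        a_rx = leaky_update(a_rx, s, e, delta)
    return abs(a_tx - a_rx)
```

Because the signal-dependent term is the same at both ends, it cancels in the difference, and the initial gap shrinks by a factor of (1-δ) per update regardless of the signal.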

[009] Instability or oscillation of the receiver may still occur in pole-based 
predictor systems due to the feedback loop to the predictor, which uses the 
prediction error signal Êj and the preceding reconstructed input signal to 
derive the prediction coefficients as described above. Stability checking is often 
used to ensure that the prediction coefficients remain in desired ranges, but at 
the expense of increased complexity as the number of poles, i.e., coefficients, 
increases. 

[0010] In U.S. Pat. No. 4,317,208, Araseki et al. describe a system that also 
employs zero-based predictors, such as predictors 120 and 128, which do not 
utilize a feedback loop but which are known to provide less predictor gain than 
pole-based predictors and consequently inhibit or slow down the adaptation 
process. They propose that such a combination of pole-based and zero-based 
predictors may overcome the instability issues described above, and gain the 
advantages of each type of predictor. 

[0011] In U.S. Pat. No. 4,593,398, issued June 3, 1986 to Millar and incorporated 
by reference herein, it is suggested that a pole-based predictor, even coupled 
with a zero-based predictor, is still vulnerable to mistracking if the input signal 
contains two tones of equal amplitude but different frequencies. Millar notes 
that certain input signals may cause the pole-based predictor adaptation driven 
by the feedback loop to have multiple stable states; thus, the receiver 104 may 
stabilize with its prediction coefficients at values different from those of the 
transmitter 102. This in turn is likely to cause a distorted frequency response at 
the receiver 104 and its associated audio output device. 

[0012] The Millar patent proposes to mitigate the problems associated with lower 
predictor gain in zero-based predictors and mistracking in pole-based predictors. 
In the system described by Millar and depicted in FIG. 1, the predictor 
means in the transmitter 102 and the receiver 104 derive the prediction 
coefficients based on an algorithm including a non-linear function having no 
arguments comprising the value of the reconstructed input signal, such as signals 
Sj and Sj-1. This coefficient adaptation is depicted by arrows 119 and 127. This is 
in contrast to the Araseki system, wherein the prediction coefficients are partially 
derived from a reconstructed input signal, which is dependent upon the predicted 
signal Sj, which in turn is dependent upon all of the immediate past 
coefficient values. 

[0013] It is postulated that the Millar system and method may be 
computationally expensive to implement. Therefore, what is needed is a system 
and method in which the convergence to the optimal prediction coefficients, 
and thus to the predicted signal Sj, occurs more rapidly and efficiently than in 
prior art systems. 


SUMMARY 

[0014] An improved adaptive differential pulse code modulation (ADPCM) 
system and method comprises an encoder and a decoder linked together by a 
network connection and configured for processing digital audio signals. More 
particularly, the technique described is related to adaptation of predictor 
coefficients in an ADPCM environment. The components of the system may be 
implemented in software form as instructions executable by a processor or in 
hardware form as digital circuitry. Furthermore, devices implementing the 
system and method described are preferably configured to include both an 
encoder and a decoder for bi-directional communication with a similarly situated 
remote device, or may be configured with solely the encoder or decoder. 

[0015] At the encoder, a digitized input signal is applied to a subtractor, which 
derives a difference signal by subtracting from the input signal a predicted signal 
generated by a pole-based predictor. After quantizing, transmitting to a decoder, 
and inverse quantizing, the difference signal is added to the predicted signal by 
an adder to provide a reconstructed input signal, which is fed back to the 
predictor and to the subtractor. The encoder is additionally provided with a 
whitening filter for receiving the reconstructed input signal and applying thereto 
a filtering algorithm to generate a filtered reconstructed signal. The filtered 
reconstructed signal is utilized to update, or adapt, the prediction coefficients of 
the pole-based predictor, thus providing more rapid and computationally 
efficient convergence to optimal prediction coefficients. 


[0016] The decoder operates in an inverse manner to the encoder, receiving the 
quantized difference signal from an encoder and processing it to reconstruct the 
input signal for delivery to sound reproducing means. It is noted that devices 
employing the ADPCM techniques described herein are interoperable with 
devices employing prior art techniques, for example, those described in ITU-T 
G.722. It is further noted that the techniques described herein may be adapted 
for various implementations, one example being the employment of a plurality 
of encoders and/or decoders for frequency sub-band processing. 

[0017] Other embodiments of the invention comprise additional predictors at the 
encoder and the decoder, operating to maximize the signal-to-noise ratio for 
certain input signals. The additional predictors are preferably zero-based 
predictors, the output therefrom being summed with the pole-based predictor 
output to produce the predicted signal. 


BRIEF DESCRIPTION OF THE FIGURES 


[0018] In the accompanying drawings: 

[0019] FIG. 1 depicts a prior art ADPCM system; 

[0020] FIG. 2 depicts an ADPCM system, in accordance with a first embodiment 
of the invention; and 

[0021] FIG. 3 depicts another ADPCM system, in accordance with a second 
embodiment of the invention. 

DESCRIPTION OF PREFERRED EMBODIMENTS 

[0022] FIG. 2 depicts a first embodiment of an ADPCM system 200 in accordance 
with the invention. ADPCM system 200 comprises an encoder 202 and decoder 
204 linked in communication by a network connection 206, such as an ISDN line, 
fractional T1 line, digital satellite link, wireless modems, or like digital 
transmission service. At encoder 202, a digitized input signal, typically 
representative of speech, is applied to a conventional subtractor 208. The input 
signal is represented as Yj, signifying a value at sample period j. Subtractor 208 
derives a difference signal Ej by subtracting from input signal Yj a predicted 
signal Sj generated by a pole-based predictor 210. The difference signal Ej is 
quantized by a conventional quantizer 212 to obtain a quantized numerical 
representation Nj for transmission to decoder 204 over the network connection 
206. Quantizer 212 is preferably of the adaptive type, but a quantizer utilizing 
fixed step sizes may also be used. 

[0023] Numerical representation Nj is also applied to a conventional inverse 
quantizer 214, which derives a regenerated difference signal Dj. A conventional 
adder 216 adds regenerated difference signal Dj to the predicted signal Sj (output 
by the pole-based predictor 210) to provide a reconstructed input signal Xj. The 
reconstructed input signal Xj is in turn applied to the pole-based predictor 210, 
which calculates the predicted signal Sj in accordance with the following 
equation: 

$$S_j = a_1^j S_{j-1} + a_2^j S_{j-2} + \cdots + a_n^j S_{j-n}$$

where Sj-1 is a stored value of the predicted signal at sample period j-1, Sj-2 is a 
stored value of the predicted signal at sample period j-2, and so on, and $a_1^j$ to 
$a_n^j$ are the predictor coefficients at sample period j, where n corresponds to the 
total number of poles (i.e., coefficients) of pole-based predictor 210. In one 
implementation of ADPCM system 200, the pole-based predictor 210 is limited to 
two poles, yielding the relation: 

$$S_j = a_1^j S_{j-1} + a_2^j S_{j-2}$$

[0024] The predicted signal Sj generated by predictor 210 is then applied to adder 
216, completing the feedback loop. 

[0025] Predictor coefficients $a_1^j$ and $a_2^j$ are updated in accordance with the 
generalized equations: 

$$a_1^{j+1} = a_1^j (1-\delta_1) + g_1 \cdot F_1(X_j^f, X_{j-1}^f, X_{j-2}^f)$$

$$a_2^{j+1} = a_2^j (1-\delta_2) + g_2 \cdot F_2(X_j^f, X_{j-1}^f, X_{j-2}^f, X_{j-3}^f, a_1^j)$$

where $X_j^f$ is a filtered version of reconstructed input signal Xj at sample period j; 
δ1, δ2, g1 and g2 are proper positive constants; and F1 and F2 are nonlinear 
functions which may consist of correlations, sign-correlations, or other 
relationships. Calculation of the filtered reconstructed signal $X_j^f$ is discussed 
below. 

[0026] In general, whitening filters modify the spectrum of signals to provide a 
flatter signal spectrum, so that there is less variation of energy as a function of 
frequency. It is noted that a perfect white noise signal has equal energy at every 
frequency. Stochastic gradient adaptive filters generally converge more rapidly 
with white signals than with non-white signals. Therefore, the use of a 
whitening filter in the present system and method is preferred at least for its 
effect on convergence of the adaptive pole-based predictors 210 and 226. 

[0027] Referring back to FIG. 2, a whitening filter 218 receives the reconstructed 
input signal Xj and applies thereto a filtering algorithm to generate a filtered 
reconstructed signal $X_j^f$. To ensure stable operation of whitening filter 218, the 
filter coefficients $a_1^{f,j+1}$ and $a_2^{f,j+1}$ undergo the clamping set forth below at 
every other time step (i.e., for odd values of j): 

[0028] $a_2^{f,j+1}$ is clamped to a maximum of 12288 and a minimum of -12288; and 

$a_1^{f,j+1}$ is clamped in magnitude to 15360 - $a_2^{f,j+1}$. 

Implementation of this clamping routine is exemplified as: 

temp = 15360 - $a_2^{f,j+1}$; 

if $a_1^{f,j+1}$ > temp, then $a_1^{f,j+1}$ is set to equal temp; 

if $a_1^{f,j+1}$ < -temp, then $a_1^{f,j+1}$ is set to equal -temp. 
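The clamping routine can be written out as a short function. This is a minimal sketch that follows the limits stated in the text; the function name is hypothetical. If the coefficients are read as fixed-point values scaled by 16384 (an interpretation the text does not state), the limits correspond to 0.75 and 0.9375.

```python
def clamp_coefficients(a1, a2):
    """Clamp the second-order coefficient, then bound the first by it.

    a2 is limited to [-12288, 12288]; a1 is then limited in magnitude
    to 15360 - a2, using the already-clamped a2.
    """
    a2 = max(-12288, min(12288, a2))
    temp = 15360 - a2
    if a1 > temp:
        a1 = temp
    if a1 < -temp:
        a1 = -temp
    return a1, a2
```

For example, `clamp_coefficients(20000, 14000)` first limits the second coefficient to 12288, which leaves a bound of 3072 for the first. The order matters: the bound on the first coefficient depends on the clamped value of the second.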

[0029] The filtered reconstructed signal $X_j^f$ output by whitening filter 218 is 
utilized to update the predictor coefficients $a_1^{j+1}$ and $a_2^{j+1}$, as described above 
and indicated on FIG. 2 by arrow 220. 

[0030] According to a preferred implementation, whitening filter 218 has two 
zeroes, yielding the relation: 

$$X_j^f = X_j - a_1^f X_{j-1} - a_2^f X_{j-2}$$

where $a_1^f$ and $a_2^f$ are the first and second order filter coefficients. The filter 
coefficients $a_1^f$ and $a_2^f$ are updated at each time step j in accordance with the 
following equations: 

$$a_2^{f,j+1} = a_2^{f,j}\left(1 - \frac{256}{32768}\right) \; [\ldots] \; \mathrm{sgn}[X_j^f]\,\mathrm{sgn}[X_{j-1}^f] + 128\,\mathrm{sgn}[X_j^f]\,\mathrm{sgn}[X_{j-2}^f]; \text{ and}$$

$$a_1^{f,j+1} = a_1^{f,j}\left(1 - \frac{128}{32768}\right) + 192\,\mathrm{sgn}[X_j^f]\,\mathrm{sgn}[X_{j-1}^f];$$

where sgn[ ] is the sign function that returns a value of 1 for a nonnegative 
argument and a value of -1 for a negative argument. 
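The two-zero filter relation and the first-order coefficient update can be sketched as follows. The function names are hypothetical, the filter is assumed to start with zero history, and `whiten` takes its coefficients as plain real numbers. The scaling that relates those real values to the constants 192 and 32768 is not stated in the text and is left out of the sketch. The two sign arguments of the update are passed in by the caller rather than fixed to a particular signal.

```python
def sgn(v):
    """Sign function: 1 for a nonnegative argument, -1 for a negative one."""
    return 1 if v >= 0 else -1


def whiten(x, a1f, a2f):
    """Apply X^f_j = X_j - a1f*X_{j-1} - a2f*X_{j-2} to a sequence.

    Samples before the start of the sequence are assumed to be zero.
    """
    out = []
    for j, xj in enumerate(x):
        x1 = x[j - 1] if j >= 1 else 0
        x2 = x[j - 2] if j >= 2 else 0
        out.append(xj - a1f * x1 - a2f * x2)
    return out


def update_a1f(a1f, s_now, s_prev):
    """First-order update: leak of 128/32768 plus a 192-weighted sign product."""
    return a1f * (1 - 128 / 32768) + 192 * sgn(s_now) * sgn(s_prev)
```

A ramp such as `[1, 2, 3, 4]` filtered with `a1f = 1.0` and `a2f = 0.0` becomes a constant sequence, which shows how the filter removes sample-to-sample correlation.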

[0031] In accordance with a computationally economical implementation of 
ADPCM system 200, the values of the predictor coefficients may be frozen at 
every other sample interval j. By recalculating predictor coefficients for 
pole-based predictor 210 only at every other interval, computational resources 
are conserved. This implementation is described by the following equations: 

for even j: 

$$a_2^{j+1} = a_2^j; \text{ and}$$

$$a_1^{j+1} = a_1^j;$$

else for odd j: 

$$\ldots + 191.25\,\mathrm{sgn}[X_{j-1}^f]\,\mathrm{sgn}[X_{j-2}^f] + 192\,\mathrm{sgn}[X_j^f]\,\mathrm{sgn}[X_{j-1}^f];$$

where sgn[ ] is the sign function that returns a value of 1 for a nonnegative 
argument and a value of -1 for a negative argument, and 

$$\mathrm{lim}[a_1^j] = a_1^j \text{ for } -8192 \le a_1^j \le 8191;$$

$$\mathrm{lim}[a_1^j] = -8192 \text{ for } a_1^j < -8192; \text{ and}$$

$$\mathrm{lim}[a_1^j] = 8191 \text{ for } a_1^j > 8191.$$
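One possible reading of the 191.25 weight in the odd-interval update is that it folds two consecutive single-sample updates into one, since 192 × (1 - 128/32768) = 191.25. The check below is an inference rather than a statement from the text. It assumes a single-sample first-order rule with a leak of 128/32768 and a sign-product weight of 192, and the function names are hypothetical.

```python
K = 128 / 32768  # assumed per-sample leak of the first-order coefficient


def single_step(a1, t):
    """One per-sample update; t is a product of two signs (+1 or -1)."""
    return a1 * (1 - K) + 192 * t


def double_step(a1, t_prev, t_now):
    """Two per-sample updates folded into one expression."""
    return a1 * (1 - K) ** 2 + 191.25 * t_prev + 192 * t_now


# Applying the single step twice matches the folded form for every sign pair.
matches = all(
    abs(single_step(single_step(a1, tp), tn) - double_step(a1, tp, tn)) < 1e-9
    for a1 in (-8192, -1, 0, 300, 12000)
    for tp in (-1, 1)
    for tn in (-1, 1)
)
```

Under this reading, freezing the coefficients on even intervals does not discard the even-interval sign information. It is carried forward and applied, slightly attenuated, at the next odd interval.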

[0032] To ensure stability, $a_2^{j+1}$ and $a_1^{j+1}$ are clamped similarly to $a_2^{f,j+1}$ and 
$a_1^{f,j+1}$ as described above. That is: 

$a_2^{j+1}$ is clamped to a maximum of 12288 and a minimum of -12288; and 

$a_1^{j+1}$ is clamped in magnitude to 15360 - $a_2^{j+1}$. 

Implementation of this clamping routine is exemplified as: 

temp = 15360 - $a_2^{j+1}$; 

if $a_1^{j+1}$ > temp, then $a_1^{j+1}$ is set to equal temp; 

if $a_1^{j+1}$ < -temp, then $a_1^{j+1}$ is set to equal -temp. 

[0033] Decoder 204 operates in an inverse manner to encoder 202. Inverse 
quantizer 222 receives the numerical representation Nj over network connection 
206 and derives the regenerated difference signal Dj. Adder 224 sums the 
regenerated difference signal Dj with the predicted signal Sj generated by 
pole-based predictor 226 to produce the reconstructed input signal Xj. The 
reconstructed input signal Xj is then delivered to sound-reproducing means 
(which will typically include a D/A converter and loudspeaker) for reproduction 
of the speech represented by the input signal Yj. 
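The signal path through the encoder and decoder for a single sample can be sketched as below. The sketch assumes a fixed-step uniform quantizer (the text prefers an adaptive one) and takes the predicted value as an input rather than computing it. The function names and the step size are hypothetical.

```python
def encode_step(y, s, step):
    """Process one input sample Y_j given the predicted value S_j.

    Returns the transmitted code N_j and the reconstructed sample X_j.
    """
    e = y - s                # subtractor: difference signal E_j
    n = round(e / step)      # quantizer with an assumed fixed step size
    d = n * step             # inverse quantizer: regenerated difference D_j
    x = s + d                # adder: reconstructed input signal X_j
    return n, x


def decode_step(n, s, step):
    """Rebuild X_j at the decoder from the received code and its own S_j."""
    d = n * step             # inverse quantizer
    return s + d             # adder


n, x_enc = encode_step(1000, 900, 16)
x_dec = decode_step(n, 900, 16)
```

Because the encoder forms its reconstruction from the quantized code rather than from the raw difference, both ends compute the same reconstructed value as long as their predictions agree, and the reconstruction error stays within half a quantizer step.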

[0034] At the decoder 204, the reconstructed input signal Xj is additionally 
applied to whitening filter 230 and pole-based predictor 226. Pole-based 
predictor 226 operates in a substantially identical manner to pole-based predictor 
210 of encoder 202 and generates as output predicted signal Sj, which is applied 
to adder 224 to complete the feedback loop. Whitening filter 230, which operates 
in a substantially identical manner to whitening filter 218 of encoder 202, 
provides as output a filtered reconstructed signal X f j for use by pole-based 
predictor 226 in updating the predictor coefficients, as discussed above and 
indicated on FIG. 2 by arrow 228. 

[0035] Those skilled in the art will recognize that the various components of 
encoder 202 and decoder 204 will typically be implemented in software form as 
program instructions executable by a general purpose processor. Alternatively, 
one or more components of encoder 202 and/or decoder 204 may be 
implemented in hardware form as digital circuitry. 

[0036] Additionally, those skilled in the art will recognize that, although the 
pole-based predictors 210 and 226 are described above in terms of a two-pole 
implementation, the invention is not limited thereto and may be implemented in 
connection with pole-based predictors having any number of poles. 

[0037] It is additionally noted that the ADPCM technique embodied in the 
invention may be adapted in various well-known ways in order to improve the 
speed and performance of the encoding and decoding processes. For example, a 
transmitting entity may break the input signal into a plurality of 
frequency-limited sub-bands, wherein each sub-band is applied to a separate encoder 
operating in a substantially identical manner to encoder 202. The sub-banded 
encoded signals are then multiplexed for transmission to a receiving entity over 
the network connection. The receiving entity then demultiplexes the received 
signal into a plurality of sub-banded signals and directs each sub-banded signal 
to a separate decoder operating in a manner substantially identical to decoder 
204. The sub-banded reconstructed signals are thereafter combined and 
conveyed to sound-reproducing means. 

[0038] In other embodiments of the invention, additional predictors may be 
combined with the pole-based predictors to maximize the signal-to-noise ratio 
for certain input signals. Referring now to the FIG. 3 embodiment of an ADPCM 
system 300, encoder 302 differs from encoder 202 of the FIG. 2 embodiment by 
the addition of a conventional zero-based predictor 306. Zero-based predictor 
306 receives the regenerated difference signal Dj and produces a zero-based 
partial predicted signal $S_j^Z$, which is added to the partial pole-based predicted 
signal $S_j^P$ (equal to Sj in the FIG. 2 embodiment) by adder 308 to provide 
predicted signal Sj. Predicted signal Sj is in turn applied to the feedback loop of 
pole-based predictor 210 and to subtractor 208. It is noted that zero-based 
predictor 306 does not have a feedback loop, and its predictor coefficients are 
conventionally updated with dependence on regenerated difference signal Dj. 

[0039] Similarly, decoder 304 differs from decoder 204 of the FIG. 2 embodiment 
by the inclusion of zero-based predictor 310. The regenerated difference signal 
Dj is applied to zero-based predictor 310, which generates as output a zero-based 
partial predicted signal $S_j^Z$. Adder 312 combines the zero-based partial predicted 
signal $S_j^Z$ with pole-based partial predicted signal $S_j^P$ provided by pole-based 
predictor 226 to produce the predicted signal Sj. 
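The summation of the two partial predictions can be sketched as follows. The text describes the zero-based predictor only as conventional, so the tapped form used here (a weighted sum of past regenerated difference values) is an assumption, as are the function names, tap weights, and sample values.

```python
def zero_based_prediction(d_history, b):
    """Weighted sum of past regenerated differences; d_history[0] is D_{j-1}.

    The tap weights b are hypothetical; the text does not specify them.
    """
    return sum(bk * dk for bk, dk in zip(b, d_history))


def combined_prediction(s_pole, d_history, b):
    """Adder: pole-based partial prediction plus zero-based partial prediction."""
    return s_pole + zero_based_prediction(d_history, b)


s_j = combined_prediction(100.0, [4.0, -2.0], [0.5, 0.25])
```

Since the zero-based part depends only on past difference values and not on its own output, it adds no feedback path of its own.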

[0040] Another embodiment of the invention utilizes at least one look-up table in 
determining the proper coefficients for the predictors, i.e., pole-based predictors 
210 and 226 of FIGs. 2 and 3, and/or zero-based predictors 306 and 310 of FIG. 3. 
For example, the first pole-based predictor coefficient is a function of three 
quantities: its former value, the sign of the current value of the sum of the 
quantized prediction error plus the all-zero predictor, and the sign of the past 
value of the sum of the quantized prediction error plus the all-zero predictor. In 
this embodiment, no arithmetic is necessary in determining a prediction 
coefficient value; however, identical input-output characteristics of the predictors 
are preserved. 
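A table-driven update of this kind can be sketched by precomputing the result for every combination of former coefficient value and sign pair. The update rule, the rounding to integers, and the ±15360 range used to bound the table are all assumptions made for illustration, and the names are hypothetical.

```python
A_MIN, A_MAX = -15360, 15360   # assumed coefficient range for the table


def arithmetic_update(a1, s_now, s_prev):
    """Reference update computed with arithmetic (assumed rule and rounding)."""
    value = round(a1 * (1 - 128 / 32768) + 192 * s_now * s_prev)
    return max(A_MIN, min(A_MAX, value))


# Built once, ahead of time: one entry per (former value, sign, sign) triple.
TABLE = {
    (a1, s_now, s_prev): arithmetic_update(a1, s_now, s_prev)
    for a1 in range(A_MIN, A_MAX + 1)
    for s_now in (-1, 1)
    for s_prev in (-1, 1)
}


def table_update(a1, s_now, s_prev):
    """Run-time update: a single lookup, with no multiplication or addition."""
    return TABLE[(a1, s_now, s_prev)]
```

Every table output falls back inside the table's input range, so the lookup can be applied repeatedly, and it returns exactly what the arithmetic version would.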

[0041] It should be appreciated that devices utilizing the above-described 
ADPCM techniques, such as audioconferencing or videoconferencing endpoints, 
will typically be equipped for bi-directional communications over the network 
connection, and so will be provided with both an encoder (such as encoder 202 
or 302) for encoding local audio for transmission to a remote endpoint as well as 
a decoder (such as decoder 204 or 304) for decoding audio signals received from 
the remote endpoint. 

[0042] It is further noted that devices employing the above-described ADPCM 
techniques of the invention are advantageously interoperable with devices 
employing some prior art ADPCM techniques, such as those described in the 
aforementioned Millar reference and the ITU-T G.722 reference. 

[0043] Finally, it is generally noted that while the invention has been particularly 
shown and described with reference to preferred embodiments thereof, it will be 
understood by those skilled in the art that various changes in form and details 
may be made therein without departing from the spirit and scope of the 
invention. 


